25 research outputs found

    A abordagem POESIA para a integração de dados e serviços na Web semantica

    Advisor: Claudia Bauzer Medeiros. Doctoral thesis, Universidade Estadual de Campinas, Instituto de Computação.
    Abstract: POESIA (Processes for Open-Ended Systems for Information Analysis), the approach proposed in this work, supports the construction of complex processes that involve the integration and analysis of data from several sources, particularly in scientific applications. The approach is centered on two types of semantic Web mechanisms: scientific workflows, to specify and compose Web services; and domain ontologies, to enable semantic interoperability and management of data and processes. The main contributions of this thesis are: (i) a theoretical framework to describe, discover and compose data and services on the Web, including rules to check the semantic consistency of resource compositions; (ii) ontology-based methods to help data integration and estimate data provenance in cooperative processes on the Web; (iii) partial implementation and validation of the proposal in a real application in the domain of agricultural planning, analyzing the benefits and the efficiency and scalability limitations of current semantic Web technology when faced with large volumes of data.
    Degree: Doctorate, Computer Science (Doctor in Computer Science).

    Busca em subespaços em varias dimensões

    Advisor: Pedro Jussieu de Rezende. Master's dissertation, Universidade Estadual de Campinas, Instituto de Matematica, Estatistica e Ciencia da Computação.
    Abstract: The main objective of this work is the study of solutions found in the literature to range search, from the viewpoint of algorithm design and computational geometry, considering only data objects in the form of points embedded in a multidimensional space, and investigating various shapes of ranges. Several formulations and solutions to range search problems are surveyed. These are described under one abstract view, with uniform notation and in a form hopefully clearer than the original sources, in such a way that comparisons of the nature and functionality of the solutions, and more detailed studies, may be facilitated.
    Our purpose is to make the ideas deriving from the research on range search available in a more integrated and simpler way, both to people interested in the discovery of more suitable and efficient methods for problems in theoretical computer science and to those interested in the applications of these ideas. A wide study of the solutions found in the literature shows many conceptual similarities in the methods employed. Frequently, the same approaches and techniques are seen in distinct situations. These general-purpose approaches and techniques are called "algorithm paradigms". The study and application of these paradigms allow a certain level of systematization of the solutions to range search problems, because they allow one to perceive several solutions to various instances of a general problem as manifestations of the same rationale. The study of algorithm paradigms is also instructive in its own right, since it fosters the development of systematic reasoning, useful in the solution of many problems in computer science.
    The contents are arranged so as to first give the theoretical basis necessary for understanding the methods given later. In chapter 1, we provide the basic concepts and classifications related to search problems in general and to range search in particular, and establish the scope of our research. In chapter 2, we describe some algorithm paradigms applied to range search problems, with the purpose of supplying the reader with alternative ways of establishing connections among the solutions presented later, leading the reader to develop a reasoning that allows the identification of the fundamentals and techniques shared by the solutions. In chapters 3 to 6, we deal with the variations of the range search problem characterized by the classical shapes of ranges considered in the literature. These chapters are arranged in a convenient order that reflects the complexity of the discussed solutions, their nature and their historical evolution. In each of these chapters, the problems are discussed in detail, some solutions and lower bounds are briefly described, and bibliographic notes containing references to specific subjects are presented. Finally, in chapter 7, we summarize the contributions of this work and list extensions that can be undertaken in the future.
    Degree: Master's, Computer Science (Master in Computer Science).
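
As a minimal illustration of the range search problem surveyed in this dissertation (not one of the methods it describes), the sketch below reports the points that fall inside an axis-parallel query range by linear scan, and counts one-dimensional matches with two binary searches on a presorted list. The function names and the sample points are hypothetical.

```python
from bisect import bisect_left, bisect_right
from typing import List, Sequence, Tuple

Point = Tuple[float, ...]


def range_search_naive(points: Sequence[Point],
                       lower: Point, upper: Point) -> List[Point]:
    """Report every point p with lower[i] <= p[i] <= upper[i] in each dimension.

    Linear scan: O(n * d) time per query, no preprocessing.
    """
    return [p for p in points
            if all(lo <= x <= hi for lo, x, hi in zip(lower, p, upper))]


def range_count_1d(sorted_values: Sequence[float], lo: float, hi: float) -> int:
    """Count values in [lo, hi] with two binary searches on a presorted list."""
    return bisect_right(sorted_values, hi) - bisect_left(sorted_values, lo)


# Hypothetical sample data: four points in the plane.
points = [(1.0, 5.0), (2.0, 2.0), (4.0, 3.0), (6.0, 1.0)]
inside = range_search_naive(points, lower=(1.5, 1.5), upper=(5.0, 4.0))
count = range_count_1d(sorted(p[0] for p in points), 1.5, 5.0)
```

The linear scan is only a baseline; the one-dimensional count shows how preprocessing (here, sorting) can reduce query time to logarithmic, which is the kind of trade-off that range search data structures generalize to more dimensions and other range shapes.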

    SMoT+: Extending the SMoT Algorithm for Discovering Stops in Nested Sites

    Several methods have been proposed to analyse trajectory data. However, few of these methods consider the relations between trajectories and relevant features of the geographic space. One of the best-known methods that take into account the geographical regions crossed by a trajectory is the SMoT algorithm. Nevertheless, SMoT considers only disjoint geographic regions that a trajectory may traverse, while many regions of interest are contained in other regions. In this article, we extend the SMoT algorithm to discover stops in nested regions. The proposed algorithm, called SMoT+, takes advantage of information about the hierarchy of nested regions to efficiently discover the stops in regions at different levels of this hierarchy. Experiments with real data show that SMoT+ detects stops in nested regions that are not detected by the original SMoT algorithm, with only a minor increase in processing time.
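
The idea of detecting stops in nested regions can be sketched as follows. This is an illustrative simplification, not the published SMoT+ algorithm: it assumes rectangular regions, a per-region minimum stay duration, and a trajectory given as time-ordered `(t, x, y)` samples. The `Region` class and both function names are hypothetical. Child regions are tested only when the sample lies inside their parent, which is one way the hierarchy can save work.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


@dataclass
class Region:
    name: str
    box: Tuple[float, float, float, float]  # (xmin, ymin, xmax, ymax)
    min_duration: float                      # minimum stay to count as a stop
    children: List["Region"] = field(default_factory=list)

    def contains(self, x: float, y: float) -> bool:
        xmin, ymin, xmax, ymax = self.box
        return xmin <= x <= xmax and ymin <= y <= ymax


def regions_at(x: float, y: float, roots: List[Region]) -> List[Region]:
    """All regions containing (x, y); children are visited only via a matching parent."""
    found, stack = [], list(roots)
    while stack:
        region = stack.pop()
        if region.contains(x, y):
            found.append(region)
            stack.extend(region.children)
    return found


def find_stops(trajectory: List[Tuple[float, float, float]],
               roots: List[Region]) -> List[Tuple[str, float, float]]:
    """Return (region_name, first_time_inside, last_time_inside) for each stop."""
    open_runs: Dict[str, Tuple[Region, float, float]] = {}
    stops: List[Tuple[str, float, float]] = []

    def close(name: str) -> None:
        region, start, last = open_runs.pop(name)
        if last - start >= region.min_duration:
            stops.append((name, start, last))

    for t, x, y in trajectory:
        current = {r.name: r for r in regions_at(x, y, roots)}
        for name in list(open_runs):          # runs that ended at this sample
            if name not in current:
                close(name)
        for name, region in current.items():  # extend or open runs
            start = open_runs[name][1] if name in open_runs else t
            open_runs[name] = (region, start, t)
    for name in list(open_runs):              # runs still open at the end
        close(name)
    return sorted(stops, key=lambda s: (s[1], s[0]))


# Hypothetical example: a cafe nested inside a mall.
mall = Region("mall", (0, 0, 10, 10), 5,
              children=[Region("cafe", (2, 2, 4, 4), 3)])
trajectory = [(0, -1, -1), (1, 1, 1), (2, 3, 3), (4, 3, 3),
              (6, 3.5, 3), (7, 8, 8), (9, 9, 9), (10, 20, 20)]
stops = find_stops(trajectory, [mall])
```

In this example the trajectory yields one stop in the mall and one in the cafe nested inside it; an approach restricted to disjoint regions would have to pick one level of the hierarchy and miss the other.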

    STING Report: convenient web-based application for graphic and tabular presentations of protein sequence, structure and function descriptors from the STING database

    STING Report is a versatile web-based application for extracting and presenting detailed information about any individual amino acid of a protein structure stored in the STING Database. The extracted information is presented as a series of GIF images and tables containing the values of up to 125 sequence/structure/function descriptors/parameters. The GIF images are generated by the Gold STING modules. The HTML page resulting from a STING Report query can be printed and, most importantly, it can be composed and visualized on a computer platform with an elementary configuration. Using STING Report, a user can generate a collection of customized reports for amino acids of specific interest. Such a collection is well suited to rapid and detailed consultation and documentation of structure/function data. Including information generated with STING Report in a research report, or even a textbook, increases the density of its contents. STING Report is freely accessible within the Gold STING Suite at http://www.cbi.cnptia.embrapa.br, http://www.es.embnet.org/SMS/, http://gibk26.bse.kyutech.ac.jp/SMS/ and http://trantor.bioc.columbia.edu/SMS (option: STING Report).

    The Diamond STING server

    Diamond STING is a new version of the STING suite of programs for comprehensive analysis of the relationship between protein sequence, structure, function and stability. We have added a number of new functionalities, both by providing more structure parameters to the STING Database and by improving and expanding the interface for enhanced data handling. The integration among the STING components has also been improved. A key new feature is the ability of the STING server to handle local files containing protein structures (either modeled or not yet deposited in the Protein Data Bank) so that they can be used by the principal STING components: (Java)Protein Dossier ((J)PD) and STING Report. The current capabilities of the new STING version and a couple of biologically relevant applications are described here. We provide an example in which Diamond STING identifies the active-site amino acids and the folding-essential amino acids (both previously determined by experiments) by selecting numerical values or ranges for a set of corresponding parameters, so that all other residues are filtered out. This is the fundamental step toward a more interesting endeavor: the prediction of such residues. Diamond STING is freely accessible at and

    A Survey on Information Systems Interoperability

    The interoperability of information systems has been pursued for a long time and is in even greater demand in the Internet era. This paper reviews the literature in this area from the database perspective. It covers work on the interconnection of databases, the classification of data integration problems, major standards and architectures, and the most recent developments in the fields of semantic Web, Web services and scientific workflows.

    Aplicando Ontologias de Objetos Geográficos para Facilitar Navegação em GIS

    The Semantic Web has become an active research area with many promising applications. This paper makes a concrete contribution to the adoption of Semantic Web technology in GIS by describing the use of a domain ontology to help navigation on maps and to support the integration of geographic objects on the Web. The OntoCarta system, which we are developing to demonstrate our methods, relies on current standards and public domain tools to build a map navigator that includes: (1) a viewer for maps at different scales; (2) a domain ontology to describe and correlate map objects. The combination of these components results in a knowledge-directed cartographic navigation system. This system supports map zooming while keeping contextual information for different levels of abstraction. The adoption of open formats to represent the domain ontology, together with the consensual character of this ontology, enables the use of OntoCarta in a Web browser and fosters data reuse throughout the Internet.

    MSC+: Language pattern learning for word sense induction and disambiguation

    Identifying the correct meaning of words in context or discovering new word senses is particularly useful for several tasks such as question answering, information extraction, information retrieval, and text summarization. However, especially in the context of user-generated content and on-line communication (e.g. Twitter), new meanings are continuously crafted by speakers as the result of existing words being used in novel contexts. Consequently, lexical semantic inventories and systems have difficulty coping with semantic drift. In this work, we propose an approach to induce and disambiguate word senses of some target words in collections of short texts, such as tweets, through the use of fuzzy lexico-semantic patterns that we define as sequences of Morpho-semantic Components (MSC+). We learn these patterns, which we call MSC+ patterns, automatically from text data. Experimental results show that instances of some patterns arise in a number of tweets, though in some of these tweets different words are used to convey the sense of the respective MSC+. Exploiting MSC+ patterns when they induce semantics on target words enables effective word sense disambiguation mechanisms, leading to improvements over the state of the art.
    This work was conducted during a doctorate partially supported by grants from CAPES (Brazilian Coordination of Superior Level Staff Improvement), a research support agency of the Ministry of Education of Brazil. CAPES also supported an internship for international cooperation with TALN (Natural Language Processing Research Group) at the Pompeu Fabra University in Barcelona, Spain. The last author acknowledges support from the Spanish Government under the María de Maeztu Units of Excellence Programme (MDM-2015-0502).

    Automatically tailoring semantics-enabled dimensions for movement data warehouses

    This paper proposes an automatic approach to build tailored dimensions for movement data warehouses based on views of existing hierarchies of objects (and their respective classes) used to semantically annotate movement segments. It selects the objects (classes) that annotate at least a given number of segments of a movement dataset to delineate hierarchy views, from which tailored analysis dimensions for that movement dataset are derived. Dimensions produced in this way can be considerably smaller than the hierarchies from which they are extracted, leading to efficiency gains, among other potential benefits. Results of experiments with tweets semantically enriched with points of interest taken from linked open data collections show the viability of the proposed approach.
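
The selection step described above can be sketched roughly as follows. This is an assumption-laden illustration rather than the paper's procedure: it assumes the hierarchy is given as a child-to-parent map, that each annotation is a single class label attached to one segment, and that a retained class is linked to its nearest retained ancestor. The function name `tailor_dimension` and the sample classes are hypothetical.

```python
from collections import Counter
from typing import Dict, Iterable, Optional


def tailor_dimension(parent: Dict[str, Optional[str]],
                     annotations: Iterable[str],
                     min_segments: int) -> Dict[str, Optional[str]]:
    """Build a hierarchy view with only the classes that annotate enough segments.

    parent:       full hierarchy as child -> parent (None for the root)
    annotations:  one class label per annotated movement segment
    min_segments: minimum number of annotated segments for a class to be kept
    Returns a child -> parent map restricted to the kept classes.
    """
    counts = Counter(annotations)
    kept = {cls for cls, n in counts.items() if n >= min_segments}

    view: Dict[str, Optional[str]] = {}
    for cls in kept:
        # Walk up the full hierarchy until a kept ancestor (or the top) is reached.
        ancestor = parent.get(cls)
        while ancestor is not None and ancestor not in kept:
            ancestor = parent.get(ancestor)
        view[cls] = ancestor
    return view


# Hypothetical hierarchy of point-of-interest classes and segment annotations.
parent = {"Cafe": "FoodPlace", "Restaurant": "FoodPlace",
          "FoodPlace": "POI", "Museum": "POI", "POI": None}
annotations = ["Cafe", "Cafe", "Cafe", "Restaurant",
               "Museum", "Museum", "POI", "POI"]
view = tailor_dimension(parent, annotations, min_segments=2)
```

Here the five-class hierarchy shrinks to a three-class view, which is the kind of size reduction the abstract refers to.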